23 research outputs found

EW-Tune: A Framework for Privately Fine-Tuning Large Language Models with Differential Privacy

Author: Behnia Rouzbeh
Ebrahimi Mohammadreza
Pacheco Jason
Padmanabhan Balaji
Publication date: 26/10/2022

Pre-trained Large Language Models (LLMs) are an integral part of modern AI and have led to breakthrough performance in complex AI tasks. Major AI companies with expensive infrastructures are able to develop and train these large models, with millions to billions of parameters, from scratch. Third parties, researchers, and practitioners are increasingly adopting these pre-trained models and fine-tuning them on their private data to accomplish their downstream AI tasks. However, it has been shown that an adversary can extract or reconstruct exact training samples from these LLMs, which can reveal personally identifiable information. The issue has raised deep concerns about the privacy of LLMs. Differential privacy (DP) provides a rigorous framework that allows adding noise in the process of training or fine-tuning LLMs such that extracting the training data becomes infeasible (i.e., with a cryptographically small success probability). While the theoretical privacy guarantees offered in most extant studies assume learning models from scratch through many training iterations in an asymptotic setting, this assumption does not hold in fine-tuning scenarios, in which the number of training iterations is significantly smaller. To address the gap, we present EW-Tune, a DP framework for fine-tuning LLMs based on the Edgeworth accountant with finite-sample privacy guarantees. Our results across four well-established natural language understanding (NLU) tasks show that while EW-Tune adds privacy guarantees to the LLM fine-tuning process, it directly contributes to decreasing the induced noise by up to 5.6% and improves the state-of-the-art LLMs' performance by up to 1.1% across all NLU tasks. We have open-sourced our implementations for wide adoption and public testing purposes. Comment: Accepted at IEEE ICDM Workshop on Machine Learning for Cybersecurity (MLC) 202
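The noise-adding step that the abstract refers to can be illustrated with a minimal sketch of the generic clip-and-add-Gaussian-noise gradient update used in differentially private training. This is not the EW-Tune implementation and it does not include the Edgeworth accountant; the function names, the list-based gradient representation, and the parameters `clip_norm` and `noise_multiplier` are assumptions chosen for illustration.

```python
import math
import random


def clip(grad, clip_norm):
    """Scale a gradient so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_average_gradient(per_sample_grads, clip_norm, noise_multiplier, rng):
    """Clip each per-sample gradient, sum, add Gaussian noise, then average.

    The noise standard deviation is noise_multiplier * clip_norm, which is
    the usual convention; a privacy accountant (not shown here) would map
    noise_multiplier and the number of steps to an (epsilon, delta) budget.
    """
    clipped = [clip(g, clip_norm) for g in per_sample_grads]
    dim = len(clipped[0])
    sigma = noise_multiplier * clip_norm
    noisy_sum = [
        sum(g[i] for g in clipped) + rng.gauss(0.0, sigma)
        for i in range(dim)
    ]
    return [s / len(clipped) for s in noisy_sum]
```

With `noise_multiplier=0.0` the function reduces to a plain average of clipped gradients, which is a convenient way to check the clipping logic in isolation.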

arXiv.org e-Print Archive

Efficient Secure Aggregation for Privacy-Preserving Federated Machine Learning

Author: Behnia Rouzbeh
Ebrahimi Mohammadreza
Hoang Thang
Padmanabhan Balaji
Riasi Arman
Publication date: 07/04/2023

Federated learning introduces a novel approach to training machine learning (ML) models on distributed data while preserving users' data privacy. This is done by distributing the model to clients, which train it on their local data, and computing the final model at a central server. To prevent any data leakage from the local model updates, various works focusing on secure aggregation for privacy-preserving federated learning have been proposed. Despite their merits, most of the existing protocols still incur high communication and computation overhead on the participating entities and might not be optimized to efficiently handle the large update vectors of ML models. In this paper, we present E-seaML, a novel secure aggregation protocol with high communication and computation efficiency. E-seaML requires only one round of communication in the aggregation phase, and it is up to 318x and 1224x faster for the user and the server (respectively) as compared to its most efficient counterpart. E-seaML also allows for efficiently verifying the integrity of the final model by allowing the aggregation server to generate a proof of honest aggregation for the participating users. This high efficiency and versatility is achieved by extending (and weakening) the assumption of the existing works on the set of honest parties (i.e., users) to a set of assisting nodes. Therefore, we assume a set of assisting nodes that assist the aggregation server in the aggregation process. We also discuss, given the minimal computation and communication overhead on the assisting nodes, how one could assume a set of rotating users to act as assisting nodes in each iteration. We provide the open-sourced implementation of E-seaML for public verifiability and testing.
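The general idea of aggregating masked updates with the help of assisting nodes can be sketched as a toy additive-masking scheme. This is an illustration of the concept only, not the E-seaML protocol: the seed sharing, the SHA-256-based mask expansion, the modulus, and all function names are assumptions, and the sketch omits key agreement, dropout handling, and the proof of honest aggregation.

```python
import hashlib

MOD = 2 ** 32  # hypothetical modulus for integer-encoded updates


def expand_mask(seed, round_no, dim):
    """Expand a shared seed into a pseudorandom mask vector for one round."""
    mask = []
    for i in range(dim):
        data = seed + round_no.to_bytes(4, "big") + i.to_bytes(4, "big")
        mask.append(int.from_bytes(hashlib.sha256(data).digest()[:4], "big"))
    return mask


def add_vec(a, b):
    return [(x + y) % MOD for x, y in zip(a, b)]


def sub_vec(a, b):
    return [(x - y) % MOD for x, y in zip(a, b)]


def mask_update(update, seeds_with_nodes, round_no):
    """User side: add one mask per assisting node to the local update."""
    masked = [u % MOD for u in update]
    for seed in seeds_with_nodes:
        masked = add_vec(masked, expand_mask(seed, round_no, len(update)))
    return masked


def node_mask_sum(seeds_with_users, round_no, dim):
    """Assisting node side: sum the masks it shares with every user."""
    total = [0] * dim
    for seed in seeds_with_users:
        total = add_vec(total, expand_mask(seed, round_no, dim))
    return total


def aggregate(masked_updates, node_sums):
    """Server side: sum masked updates, then remove the nodes' mask sums."""
    total = [0] * len(masked_updates[0])
    for masked in masked_updates:
        total = add_vec(total, masked)
    for node_sum in node_sums:
        total = sub_vec(total, node_sum)
    return total
```

In this toy setting the server sees only masked vectors and per-node mask sums, yet the masks cancel in the total, so it recovers the sum of the updates without seeing any individual update.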

arXiv.org e-Print Archive

Unsupervised Threat Hunting using Continuous Bag of Terms and Time (CBoTT)

Author: Agrawal Manish Kumar
Behnia Rouzbeh
Daniel Clinton
Kayhan Varol
Shivendu Shivendu
Publication venue: AIS Electronic Library (AISeL)
Publication date: 11/12/2023

Threat hunting is the practice of sifting through system logs to detect malicious activities that might have bypassed existing security measures. It can be performed in several ways, one of which is based on detecting anomalies. We propose an unsupervised framework, called continuous bag-of-terms-and-time (CBoTT), and publish its application programming interface (API) to help researchers and cybersecurity analysts perform anomaly-based threat hunting among SIEM logs geared toward process auditing on endpoint devices. Analyses show that our framework consistently outperforms benchmark approaches. When logs are sorted by likelihood of being an anomaly (from most likely to least), our approach identifies anomalies at higher percentiles (between 1.82 and 6.46), while benchmark approaches identify the same anomalies at lower percentiles (between 3.25 and 80.92). This framework can be used by other researchers to conduct benchmark analyses and by cybersecurity analysts to find anomalies in SIEM logs.
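The percentile comparison in the abstract can be made concrete with a small sketch. It assumes that a percentile is the rank of a log in the sorted list (most anomalous first) expressed as a percentage of the list length, so that smaller values are nearer the top; this definition, the function name, and the dictionary input are assumptions for illustration and are not taken from the CBoTT API.

```python
def anomaly_percentiles(scores, anomaly_ids):
    """Return the percentile position of each known anomaly.

    scores maps a log id to an anomaly score (higher = more anomalous).
    A result of 5.0 means the log sits within the top 5% of the ranking.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    position = {log_id: rank for rank, log_id in enumerate(ranked, start=1)}
    return {a: 100.0 * position[a] / len(ranked) for a in anomaly_ids}
```

Under this definition, an analyst who reviews logs from the top of the ranking reaches a known anomaly sooner when its percentile value is smaller.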

AIS Electronic Library (AISeL)

MUSES: Efficient Multi-User Searchable Encrypted Database

Author: Jorge Guajardo
Rouzbeh Behnia
Thang Hoang
Tung Le
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 18/05/2023

Searchable encrypted systems enable privacy-preserving keyword search on encrypted data. Symmetric Searchable Encryption (SSE) achieves high security (e.g., forward privacy) and efficiency (i.e., sublinear search), but it supports only a single user. Public Key Searchable Encryption (PEKS) supports multi-user settings; however, it suffers from inherent security limitations such as vulnerability to keyword-guessing attacks and the lack of forward privacy. Recent work has combined SSE and PEKS to achieve the best of both worlds: supporting multi-user settings and providing forward privacy while retaining sublinear complexity. However, despite its elegant design, the existing hybrid scheme inherits some of the security limitations of the underlying paradigms (e.g., pattern leakage, keyword-guessing) and might not be suitable for certain applications due to costly public-key operations (e.g., bilinear pairing). In this paper, we propose MUSES, a new multi-user encrypted search scheme that addresses the limitations of the existing hybrid design while offering user efficiency. Specifically, MUSES permits multi-user functionalities (reader/writer separation, permission revocation), prevents keyword-guessing attacks, protects search/result patterns, achieves forward/backward privacy, and features minimal user overhead. In MUSES, we demonstrate a unique incorporation of various state-of-the-art distributed cryptographic protocols, including Distributed Point Function, Distributed PRF, and Secret-Shared Shuffle. We also introduce a new oblivious shuffle protocol for the general multi-party setting with dishonest majority, which can be of independent interest. Our experimental results indicate that the keyword search in our scheme is two orders of magnitude faster, with 13× lower user bandwidth overhead, than the state-of-the-art.

Cryptology ePrint Archive

Compact Energy and Delay-Aware Authentication

Author: Attila A. Yavuz
Muslum Ozgur Ozmen
Rouzbeh Behnia
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 25/08/2018

Authentication and integrity are fundamental security services that are critical for any viable system. However, some emerging systems (e.g., smart grids, aerial drones) are delay-sensitive, and therefore their safe and reliable operation requires delay-aware authentication mechanisms. Unfortunately, the current state-of-the-art authentication mechanisms either incur heavy computations or lack scalability for such large and distributed systems. Hence, there is a crucial need for digital signature schemes that can satisfy the requirements of delay-aware applications. In this paper, we propose a new digital signature scheme that we refer to as Compact Energy and Delay-aware Authentication (CEDA). In CEDA, signature generation and verification require only a small constant number of multiplications and Pseudo Random Function (PRF) calls. Therefore, it achieves the lowest end-to-end delay among its counterparts. Our implementation results on an ARM processor and commodity hardware show that CEDA has the most efficient signature generation on both platforms, while offering fast signature verification. Among its delay-aware counterparts, CEDA has a smaller private key with a constant-size signature. All these advantages are achieved at the cost of a larger public key. This is a highly favorable trade-off for applications wherein the verifier is not memory-limited. We open-sourced our implementation of CEDA to enable its broad testing and adaptation.

Crossref

Cryptology ePrint Archive

Lattice-Based Public Key Searchable Encryption from Experimental Perspectives

Author: Attila A. Yavuz
Muslum Ozgur Ozmen
Rouzbeh Behnia
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 09/11/2018

Public key Encryption with Keyword Search (PEKS) aims to mitigate the impact of the data privacy versus utilization dilemma by allowing any user in the system to send encrypted files to the server to be searched by a receiver. The receiver can retrieve the encrypted files containing specific keywords by providing the corresponding trapdoors of these keywords to the server. Despite their merits, the existing PEKS schemes introduce a high end-to-end delay that may hinder their adoption in practice. Moreover, they do not scale well for large security parameters and provide no post-quantum security promises. In this paper, we propose two novel lattice-based PEKS schemes that offer high computational efficiency along with better security assurances than the existing alternatives. Specifically, our NTRU-PEKS scheme achieves 18 times lower end-to-end delay than the most efficient pairing-based alternatives. Our LWE-PEKS offers provable security in the standard model with a reduction to worst-case lattice problems. We fully implemented our NTRU-PEKS scheme and benchmarked its performance as deployed on Amazon Web Services cloud infrastructures.

Cryptology ePrint Archive

TACHYON: Fast Signatures from Compact Knapsack

Author: Attila A. Yavuz
Mike Rosulek
Muslum Ozgur Ozmen
Rouzbeh Behnia
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 12/12/2018

We introduce a simple, yet efficient digital signature scheme which offers post-quantum security promise. Our scheme, named TACHYON, is based on a novel approach for extending one-time hash-based signatures to (polynomially bounded) many-time signatures, using the additively homomorphic properties of generalized compact knapsack functions. Our design permits TACHYON to achieve several key properties. First, its signing and verification algorithms are the fastest among its current counterparts with a higher level of security. This allows TACHYON to achieve the lowest end-to-end delay among its counterparts, while also making it suitable for resource-limited signers. Second, its private keys can be as small as κ bits, where κ is the desired security level. Third, unlike most of its lattice-based counterparts, TACHYON does not require any Gaussian sampling during signing, and therefore, is free from side-channel attacks targeting this process. We also explore various speed and storage trade-offs for TACHYON, thanks to its highly tunable parameters. Some of these trade-offs can speed up TACHYON signing in exchange for larger keys, thereby permitting TACHYON to further improve its end-to-end delay.
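As background for the one-time hash-based signatures that the abstract builds on, the following is a minimal sketch of a classic Lamport one-time signature. It is not TACHYON and does not use generalized compact knapsack functions; the choice of SHA-256, the 256-bit message digest, and the function names are assumptions for illustration.

```python
import hashlib
import secrets

BITS = 256  # one key pair per bit of the SHA-256 message digest


def H(data):
    return hashlib.sha256(data).digest()


def keygen():
    """Private key: two random strings per bit. Public key: their hashes."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(BITS)]
    pk = [(H(a), H(b)) for a, b in sk]
    return sk, pk


def message_bits(message):
    digest = H(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(BITS)]


def sign(sk, message):
    """Reveal, for each digest bit, the matching private string."""
    return [sk[i][bit] for i, bit in enumerate(message_bits(message))]


def verify(pk, message, signature):
    """Check that each revealed string hashes to the matching public value."""
    bits = message_bits(message)
    if len(signature) != BITS:
        return False
    return all(H(s) == pk[i][bit] for i, (s, bit) in enumerate(zip(signature, bits)))
```

Each key pair must sign only one message, since every signature reveals half of the private key; that limitation is what many-time constructions aim to remove.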

Cryptology ePrint Archive

Efficient Post-Quantum and Compact Cryptographic Constructions for the Internet of Things

Author: Behnia Rouzbeh
Publication venue: Digital Commons @ University of South Florida
Publication date: 30/03/2021

IoT systems often rely on low-end devices to send measurements to other parties, and depending on the setting, unauthorized alteration and/or privacy violation of these measurements can have catastrophic consequences (e.g., embedded medical sensors). Therefore, providing efficient authentication, integrity, and confidentiality in these settings is vital. While conventional cryptographic measures (e.g., ECDSA) can be used to meet these security requirements, despite their elegant design, they are often too computationally expensive for low-end devices. This is further exacerbated when security against quantum computers is taken into account. In this dissertation, we propose a series of new efficient conventional and post-quantum cryptographic schemes to meet the stringent requirements of such IoT systems. In the line of proposing efficient authentication schemes, we propose two signature schemes. Our first signature scheme is based on conventional cryptographic problems and utilizes message encoding with cover-free families and a special property of ECDLP-based functions to achieve a significant performance gain as compared to its counterparts. The second scheme is based on post-quantum primitives and is achieved by extending one-time signatures to (polynomially bounded) many-time signatures, using the additively homomorphic properties of generalized compact knapsack functions. The new scheme achieves the lowest end-to-end delay among its counterparts, which makes it suitable for low-end devices. As a step toward a fully post-quantum blockchain, we propose a Proof of Work (PoW) protocol that minimizes the advantage of a quantum miner. Our new protocol is based on the Hermite Shortest Vector Problem (Hermite-SVP) in the Euclidean norm and allows for a fast verification algorithm.
To alleviate the hurdle of certificate communication and verification for low-end devices, we then present identity-based and certificateless cryptosystems that are created using special key generation algorithms that harness the additive homomorphic property of the exponents to enable the users to incorporate their private keys into the one provided by the trusted third party without falsifying it. The new schemes achieve better computation efficiency and comparable communication efficiency as compared to their identity-based and certificateless counterparts. Lastly, with the aim of proposing efficient and highly secure measures for secure remote data storage, we propose two lattice-based public key searchable encryption schemes with post-quantum security. To our knowledge, our schemes are the first instances of such schemes based on lattices that provide a post-quantum promise. Our first variant is based on NTRU lattices and provides a significant performance advantage and better end-to-end delay as compared to its existing counterparts. The second scheme, based on the LWE problem in the standard model, provides better security as compared to its counterparts at the cost of inferior performance. All of the proposed schemes are proven secure via rigorous security proofs and are implemented and open-sourced to allow for public testing and verification.

USFSP Digital Archive

Scholar Commons - University of South Florida